k-Nearest neighbor searching in hybrid spaces

نویسندگان

  • Dashiell Kolbe
  • Qiang Zhu
  • Sakti Pramanik
چکیده

Little work has been reported in the literature to support k-nearest neighbor (k-NN) searches/ queries in hybrid data spaces (HDS). An HDS is composed of a combination of continuous and non-ordered discrete dimensions. This combination presents new challenges in data organization and search ordering. In this paper, we present an algorithm for k-NN searches using a stages and use the properties of an HDS to derive a new search heuristic that greatly reduces the number of disk accesses in the initial stage of searching. Further, we present a performance model for our algorithm that estimates the cost of performing such searches. Our experimental results demonstrate the effectiveness of our algorithm and the accuracy of our performance estimation model. & 2014 Elsevier Ltd. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Techniques for Efficient K-nearest Neighbor Searching in Non-ordered Discrete and Hybrid Data Spaces

TECHNIQUES FOR EFFICIENT K-NEAREST NEIGHBOR SEARCHING IN NON-ORDERED DISCRETE AND HYBRID DATA SPACES

متن کامل

Non-zero probability of nearest neighbor searching

Nearest Neighbor (NN) searching is a challenging problem in data management and has been widely studied in data mining, pattern recognition and computational geometry. The goal of NN searching is efficiently reporting the nearest data to a given object as a query. In most of the studies both the data and query are assumed to be precise, however, due to the real applications of NN searching, suc...

متن کامل

Software Cost Estimation by a New Hybrid Model of Particle Swarm Optimization and K-Nearest Neighbor Algorithms

A successful software should be finalized with determined and predetermined cost and time. Software is a production which its approximate cost is expert workforce and professionals. The most important and approximate software cost estimation (SCE) is related to the trained workforce. Creative nature of software projects and its abstract nature make extremely cost and time of projects difficult ...

متن کامل

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...

متن کامل

A Parallel Algorithms on Nearest Neighbor Search

The (k-)nearest neighbor searching has very high computational costs. The algorithms presented for nearest neighbor search in high dimensional spaces have have suffered from curse of dimensionality, which affects either runtime or storage requirements of the algorithms terribly. Parallelization of nearest neighbor search is a suitable solution for decreasing the workload caused by nearest neigh...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Inf. Syst.

دوره 43  شماره 

صفحات  -

تاریخ انتشار 2014